Walkthrough
Ports changed from 5000 → 3000 in Dockerfile, compose.yaml, and example.env. scripts/scrape_soc.py was rewritten to run semester scrapes concurrently using ThreadPoolExecutor, adding …
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant Main as main()
participant Pool as ThreadPoolExecutor (max_workers=8)
participant Worker as _scrape_one(s, term)
participant SOC as SOC API
participant FS as File System
User->>Main: run scrape_soc.py
Main->>Main: build semester tasks
Main->>Pool: submit tasks
par concurrent semesters
Pool->>Worker: execute _scrape_one
Worker->>SOC: fetch semester data
SOC-->>Worker: data / error
alt success
Worker->>FS: write soc_scraped_<s>.json
Worker-->>Pool: (s, True, None)
else failure
Worker-->>Pool: (s, False, "error msg")
end
end
loop as tasks complete
Pool-->>Main: results via as_completed
Main->>Main: update counts
end
Main->>User: print summary (total, successes, failures, elapsed)
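The flow in the diagram can be sketched in Python roughly as follows. This is a minimal illustration, not the code from scripts/scrape_soc.py: the semesters mapping, the term values, the failure condition, and the run_all helper are all hypothetical stand-ins, and the real worker would call the SOC API and write soc_scraped_<s>.json instead of validating a string.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-in for the real semester mapping in the script.
semesters = {"fall": "2248", "spring": "2251", "summer": "bad"}


def _scrape_one(s, term):
    """Return (semester, ok, error) instead of raising, as in the diagram."""
    try:
        if not term.isdigit():  # placeholder for a failed SOC API request
            raise ValueError(f"invalid term {term!r}")
        # The real worker would fetch data and write soc_scraped_<s>.json here.
    except ValueError as exc:
        return (s, False, str(exc))
    else:
        return (s, True, None)


def run_all(items, workers=8):
    """Submit one task per semester and collect results as they finish."""
    results = []
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(_scrape_one, s, term) for s, term in items]
        for fut in as_completed(futures):
            results.append(fut.result())
    return results, time.perf_counter() - start


results, elapsed = run_all(list(semesters.items()))
failed = [r for r in results if not r[1]]
print(
    f"total={len(results)} ok={len(results) - len(failed)} "
    f"failed={len(failed)} elapsed={elapsed:.2f}s"
)
```

Because each worker returns a tuple rather than raising, one failed semester does not abort the others, and the main thread only has to count outcomes.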
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Actionable comments posted: 3
🧹 Nitpick comments (2)
Dockerfile (1)
12-14: Bind Gunicorn to PORT env with sane default.
Keeps 3000 by default while allowing overrides in different deploy targets.
-CMD ["gunicorn", "--bind", "0.0.0.0:3000", "app:app"]
+CMD ["sh", "-c", "gunicorn --bind 0.0.0.0:${PORT:-3000} app:app"]
-EXPOSE 3000
+EXPOSE 3000
scripts/scrape_soc.py (1)
145-166: Return a proper exit code and right-size worker count.
Improves CI usability and avoids oversubscribing when fewer semesters exist.
 def main():
-    WORKERS = 8
-    items = list(semesters.items())
+    items = list(semesters.items())
+    WORKERS = min(8, len(items))
     results: list[tuple[str, bool, str | None]] = []
@@
-    if failed:
+    if failed:
         print("Failed semesters:")
         for s, _, err in failed:
             print(f" - {s}: {err}")
+    return 0 if not failed else 1
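One edge case in the suggestion above is worth spelling out: if the semester mapping were ever empty, min(8, len(items)) would be 0, and ThreadPoolExecutor rejects max_workers values of 0 or less with a ValueError. The sketch below shows one way to clamp the worker count and derive the exit code; pick_workers and exit_code are hypothetical helper names, not functions from the script.

```python
from concurrent.futures import ThreadPoolExecutor


def pick_workers(n_items: int, cap: int = 8) -> int:
    """Never exceed the cap or the task count, but always use at least one worker."""
    return max(1, min(cap, n_items))


def exit_code(results) -> int:
    """Return 0 when every (semester, ok, error) tuple succeeded, else 1."""
    return 0 if all(ok for _, ok, _ in results) else 1


# With the clamp, an empty task list still builds a valid pool.
with ThreadPoolExecutor(max_workers=pick_workers(0)) as pool:
    empty_results = list(pool.map(str, []))

print(pick_workers(3), pick_workers(20), exit_code(empty_results))
```

For the return value to reach CI, the script's entry point would also need to pass it along, for example by calling sys.exit(main()) under the usual __main__ guard.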
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- Dockerfile (1 hunks)
- compose.yaml (1 hunks)
- scripts/scrape_soc.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
scripts/scrape_soc.py
138-138: Consider moving this statement to an else block (TRY300)
139-139: Do not catch blind exception: Exception (BLE001)
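Both Ruff findings concern the shape of the worker's try block. As a hedged illustration only (the actual lines 138-139 are not shown here), the sketch below uses a hypothetical parse_payload step: the success return moves into an else clause, which addresses TRY300, and the handler names a specific exception instead of Exception, which addresses BLE001. Which exceptions the real worker should catch depends on the HTTP and file calls it makes.

```python
import json


def parse_payload(s, raw):
    """Hypothetical worker step: parse one semester's raw response body."""
    try:
        courses = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Catch only the failure we expect; unrelated bugs still surface.
        return (s, False, f"bad JSON: {exc.msg}")
    else:
        # Reached only when parsing succeeded; real code would write
        # `courses` to the per-semester output file here.
        return (s, True, None)


print(parse_payload("fall", "[1, 2]"))
print(parse_payload("fall", "{oops"))
```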
🔇 Additional comments (1)
scripts/scrape_soc.py (1)
9-9: LGTM: concurrent futures import.
Summary by CodeRabbit
New Features
Chores